Intrinsically Disordered Protein

It has long been taught that proteins must be properly folded in order to perform their functions. This paradigm derives from work by Christian B. Anfinsen and coworkers. In the 1960s, they showed that RNase, when denatured so that 99% of its enzymatic activity was lost, could regain enzymatic activity within seconds when the denaturing agent was removed under proper conditions. They concluded that the amino acid sequence is sufficient for a protein to fold into its functional, lowest-energy conformation. This work was recognized with the 1972 Nobel Prize in Chemistry, and was subsequently confirmed and extended by many researchers.

More recently, it has been recognized that not all proteins function in a folded state. Some proteins must be unfolded or disordered in order to perform their functions, and others fold only in complex with target structures. These are termed intrinsically disordered proteins (IDPs), intrinsically unstructured proteins (IUPs), or natively unfolded proteins.

By some estimates, about 10% of all proteins are fully disordered, and about 40% of eukaryotic proteins have at least one long (>50 amino acids) disordered loop. Such sequences, under physiological conditions in vitro, display physicochemical characteristics resembling those of random coils. They possess little or no ordered structure, having instead an extended conformation with high intra-molecular flexibility, lacking any tightly packed core.

Many crystallographic structures have missing loops, that is, ranges of amino acids with no atomic coordinates in the model. These "gaps" in the model are often thought to be artifacts of inadvertent disorder in the crystal. In some cases, however, these gaps may be alerting us to the presence of intrinsically disordered loops in an otherwise folded protein. Such gaps are the basis for the DISOPRED2 disorder prediction server. FirstGlance in Jmol offers one method for locating and visualizing such gaps.

Despite the existence of compelling evidence for IUPs and intrinsically disordered loops beginning in 1990, many current textbooks of biochemistry and even some monographs on protein structure fail to mention intrinsic disorder and its importance for protein function. In 2011, Chouard provided a readable and informative overview of IUPs and how some of them function.

Examples of IUPs
Examples cover a wide variety of cellular systems, and it has been predicted that eukaryotes have more IUPs than other kingdoms. Of course, there are no PDB codes for fully disordered proteins in isolation. However, there are some crystallographic results for IUPs that undergo a disorder-to-order transition when they form a complex with another folded protein domain, such as 1jsu, 1g3j, and 1oct. Other examples are at Globular_Proteins. See further information about 1jsu and other cases below.

IUPs play roles in processes such as:
 * Cell signaling and cell cycle regulation, e.g. cyclin-dependent kinase inhibitor p21Waf1/Cip1/Sdi1
 * Oncogenesis, e.g. p53, which contains large unstructured regions in its native state
 * Assembly of cytoskeletal proteins, e.g. Tau protein
 * Membrane fusion and membrane transport, e.g. isolated components of the SNARE complex
 * DNA recognition molecules, e.g. the basic DNA-binding region of the leucine zipper protein, GCN4
 * Protein-RNA recognition, e.g. ribosomal proteins, such as L11-C76 and several others
 * Transcriptional activation domains, e.g. NF-κB, Glucocorticoid receptor, 77-262 fragment
 * Amyloid formation, e.g. prion protein, N-terminal part; NACP, the precursor of the non-Aβ component of the amyloid plaque

Principles Used in Prediction
Guided by the assumption that “since amino acid sequence determines 3-D structure, amino acid sequence should also determine lack of 3-D structure”, researchers have evaluated specific sequence features shared by IUPs and formulated algorithms to identify them.

The low hydrophobicity and high net charge of natively unfolded proteins result in a difference in amino acid composition between them and natively folded proteins. Compared to sequences of ordered proteins, disordered protein sequences are substantially depleted in I, L, V, W, F, Y, and C, which were therefore designated as “order promoting” amino acids, and enriched in E, K, R, G, Q, S, P, and A, which have been designated as “disorder promoting”. The underrepresentation of hydrophobic amino acids in a protein diminishes one of the basic thermodynamic forces known to be important for protein folding, namely, the hydrophobic interaction. Because a hydrophobic core does not form, such proteins have large hydrodynamic dimensions.
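The composition and charge-hydrophobicity arguments above can be made concrete with a small Python sketch. It computes the fractions of order-promoting and disorder-promoting residues, the mean hydropathy, and the mean net charge of a sequence. Several choices here are assumptions that do not come from this article: the Kyte-Doolittle scale rescaled to 0..1, the simple charge count (K and R as +1, D and E as -1, histidine ignored), and the linear boundary mean net charge = 2.785 × mean hydropathy − 1.151, quoted as commonly cited from the charge-hydropathy literature and worth checking against the original publications. All function names and test sequences are hypothetical, and only the 20 standard one-letter codes are handled.

```python
# Residue classes as listed in the paragraph above.
ORDER_PROMOTING = set("ILVWFYC")
DISORDER_PROMOTING = set("EKRGQSPA")

# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition_profile(seq):
    """Return (fraction order-promoting, fraction disorder-promoting)."""
    seq = seq.upper()
    n = len(seq)
    order = sum(aa in ORDER_PROMOTING for aa in seq) / n
    disorder = sum(aa in DISORDER_PROMOTING for aa in seq) / n
    return order, disorder


def mean_hydropathy(seq):
    """Mean Kyte-Doolittle hydropathy, rescaled so each residue lies in 0..1."""
    seq = seq.upper()
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute net charge per residue, using a crude neutral-pH charge count."""
    seq = seq.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return abs(positive - negative) / len(seq)


def looks_natively_unfolded(seq, slope=2.785, intercept=-1.151):
    """True if the sequence lies on the high-charge, low-hydropathy side
    of the assumed linear boundary (coefficients are assumptions, see text)."""
    return mean_net_charge(seq) > slope * mean_hydropathy(seq) + intercept


# Toy sequences, not real proteins.
for name, seq in [("charged", "EEEEKKKKKK"), ("hydrophobic", "IIIILLLLVV")]:
    print(name, composition_profile(seq), looks_natively_unfolded(seq))
```

A whole-sequence average like this says nothing about where in a protein the disorder lies; the prediction servers described in the next section score residues individually.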

Prediction Servers
The quality of predictions by various algorithms has been evaluated beginning in CASP5 (2002). The assessment of disorder predictions for CASP8 (2008) has been published.


 * DISOPRED2 (Jones Group, University College London, UK). "DISOPRED2 was trained on a set of around 750 non-redundant sequences with high resolution X-ray structures. Disorder was identified with those residues that appear in the sequence records but with coordinates missing from the electron density map. This is an imperfect means for identifying disordered residues as missing co-ordinates can also arise as an artifact of the crystalization process. False assignment of order can also occur as a result of stabilizing interactions by ligands or other macromolecules in the complex. However, this is the simplest means for defining disorder in the absence of further experimental investigation of the protein." (Quoted from the DISOPRED2 website.)


 * FoldIndex (Sussman Group, Weizmann Institute, Rehovot, Israel). FoldIndex makes predictions based on the observation that IUPs occupy the low-hydrophobicity, high-net-charge portion of charge-hydrophobicity phase space. (See Figure above.)


 * IUPred (Dosztányi, Csizmók, Tompa and Simon: Budapest, Hungary). "IUPred recognized intrinsically unstructured regions from the amino acid sequence based on the estimated pairwise energy content. The underlying assumption is that globular proteins are composed of amino acids which have the potential to form a large number of favorable interactions, whereas intrinsically unstructured proteins (IUPs) adopt no stable structure because their amino acid composition does not allow sufficient favorable interactions to form." (Quoted from the IUPred website.)


 * PONDR (Dunker Group, Indiana University and Molecular Kinetics, Inc., Indianapolis IN USA; Obradovic Group, Temple Univ., Philadelphia PA USA). "PONDR® functions from primary sequence data alone. The predictors are feedforward neural networks that use sequence information from windows of generally 21 amino acids. Attributes, such as the fractional composition of particular amino acids or hydropathy, are calculated over this window, and these values are used as inputs for the predictor. The neural network, which has been trained on a specific set of ordered and disordered sequences, then outputs a value for the central amino acid in the window. The predictions are then smoothed over a sliding window of 9 amino acids. If a residue value exceeds a threshold of 0.5 (the threshold used for training) the residue is considered disordered." (Quoted from the PONDR website.)


 * WinDiso (Grishin Lab, Dallas, Texas USA). "WinDiso is a linear, sequence- and alignment-based predictor of disordered/unfolded regions in proteins. It has the capability of adjusting for the increased tendency for disorder at protein termini. The simple weighted window-based algorithm and careful optimization technique make this a good predictor to use when trying to avoid bias toward special cases." (Quoted from the Grishin lab website.)

''The above list is incomplete. Addition of other servers is welcome, and summaries of methods, pros and cons for each server would be useful.''
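The windowing scheme described in the PONDR entry above (attributes computed over a 21-residue window, a score assigned to the central residue, smoothing over 9 residues, and a 0.5 threshold) can be sketched in Python as follows. This is not PONDR: the trained neural network is replaced by a deliberately naive stand-in score, the fraction of disorder-promoting residues in the window, and the truncation of windows at the sequence ends is also an assumption. The function names are hypothetical.

```python
DISORDER_PROMOTING = set("EKRGQSPA")


def window_scores(seq, window=21):
    """Stand-in for the neural network: score each residue by the fraction of
    disorder-promoting residues in the window centered on it.
    Windows are truncated at the sequence ends (an assumption)."""
    seq = seq.upper()
    half = window // 2
    scores = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half): i + half + 1]
        scores.append(sum(aa in DISORDER_PROMOTING for aa in segment) / len(segment))
    return scores


def smooth(values, window=9):
    """Moving average over a sliding window, truncated at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        segment = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def call_disorder(seq, threshold=0.5):
    """Return a string with 'D' where the smoothed score exceeds the threshold
    and '-' elsewhere."""
    smoothed = smooth(window_scores(seq))
    return "".join("D" if score > threshold else "-" for score in smoothed)


# Toy sequence: a hydrophobic stretch followed by a charged, flexible stretch.
toy = "ILVILVILVILVILVILVILV" + "EKSPEKSPEKSPEKSPEKSPE"
print(toy)
print(call_disorder(toy))
```

Smoothing suppresses isolated single-residue calls, at the cost of blurring the boundaries between ordered and disordered segments.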

Biological implications of IUPs
It has been proposed that the unfolded nature of IUPs provides them with advantages in recognition and binding. Although their large hydrodynamic dimensions slow down diffusion, their size provides a large target for initial molecular collisions, and the lack of rigid binding pockets permits multiple approach orientations for a binding partner, which may increase the probability of productive interactions. In addition, IUPs offer molecular plasticity, by adopting more than one conformation, and binding diversity, by binding to several different proteins; accordingly, many of the known hub proteins are IUPs. The rapid turnover of IUPs in the cell also allows them to be tightly regulated, as is often required in cell signaling and cell cycle control.

Many IUPs undergo a disorder-to-order transition
Binding of natural ligands such as a variety of small molecules, substrates, cofactors, other proteins, nucleic acids or membranes may induce folding of unstructured proteins. In addition to the cases detailed below, other examples include 1g3j, 1oct, and the Lac repressor.

The human p27Kip1 kinase inhibitory domain


The cyclin-dependent kinases (CDKs) have a central role in coordinating the eukaryotic cell division cycle. CDKs are controlled through several different processes involving the binding of activating cyclin subunits. The resulting cyclin-CDK complexes are inhibited by other proteins, collectively termed cyclin-CDK inhibitors (CKIs). One example of a CKI is p27Kip1. p27Kip1 is an IUP that binds to the phosphorylated cyclin A-CDK2 complex in an extended conformation, interacting with both cyclin A and CDK2. On cyclin A, it binds in a groove formed by conserved cyclin box residues. On CDK2, it binds and rearranges the amino-terminal lobe and also inserts into the catalytic cleft, mimicking ATP.

The transcriptional activator GCN4


The yeast transcriptional activator GCN4 belongs to a large family of eukaryotic transcription factors that includes Fos, Jun and CREB. All family members have a DNA recognition motif that consists of a coiled-coil dimerization element, the leucine zipper, and an adjoining basic region, which mediates DNA binding. This basic region is largely unstructured in the absence of DNA. Addition of DNA containing a GCN4 binding site induces the transition of this region from unstructured to α-helical.

Authorship
This article was written by Tzviya Zeev-Ben-Mordehai. Contributions by Eric Martz were minor -- his name is listed first due to a technicality.